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Abstract 

Many problems of theoretical and practical interest involve finding an 

^dt, optimum over a family of convex functions. For instance, finding the 

projection on the convex functions in H k (Q), and optimizing functionals 

(""*) arising from some problems in economics. 

In the continuous setting and assuming smoothness, the convexity con- 
straints may be given locally by asking the Hessian matrix to be positive 

^4 semidefinite, but in making discrete approximations two difficulties arise: 

the continuous solutions may be not smooth, and functions with positive 

! /" , semidefinite discrete Hessian need not be convex in a discrete sense. 

Previous work has concentrated on non-local descriptions of convexity, 
making the number of constraints to grow super-linearly with the number 
of nodes even in dimension 2, and these descriptions are very difficult to 
extend to higher dimensions. 

1 i In this paper we propose a finite difference approximation using pos- 

itive semidefinite programs and discrete Hessians, and prove convergence 

"^ under very general conditions, even when the continuous solution is not 

J^i smooth, working on any dimension, and requiring a linear number of con- 

f*. straints in the number of nodes. 

iq Using semidefinite programming codes, we show concrete examples of 

_^ approximations to problems in two and three dimensions. 

1 Introduction 

Convex and concave functions appear naturally in many branches of science such 

^ as biology (growth), medicine (dose-response), or economics (utility, production 

K^ or costs), spurring in turn the interest of other areas, for example, statistics. It 

!_h is no surprise, then, that many problems of theoretical and practical interest 

involve the optimization of a functional over a family of convex functions. 

A particularly important case is when the functional to be minimized is the 
norm of some normed function space V, i.e., given / € V find 

min \\ u - f\\y 

where C is a family of convex functions in V. Typical choices are the spaces 
L 2 (51) or H 1 ^), and, in general, the W k ' p (fl) spaces, where fi is a convex 
domain in R d . 
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Sometimes the convexity is a reasonable shape assumption on the model, 
which could be replaced by or added to other shape constraints such as ra- 
dial symmetry, harmonicity or upper and lower bounds. This is the case, for 
example, of Newton's problem of minimal resistance (see, e.g., [3j|4j[12j[T3]). 

More surprisingly perhaps, the convexity constraint may be a consequence of 
the model, as in the design of some mechanisms in economics [TU [T7] . Actually, 
our interest in the subject arose from one of these problems, which we will call 



the monopolist problem, and is described in some detail in section 6.1 In this 
problem we wish to find 

max / (Vu(i) • x — u(x)) f(x) dx, (1-1) 

ueC Jq 

where Q = [0, l] d , f is a non-negative probability density function over Q, and 
C is a family of convex functions on Q with some further properties. 

The monopolist problem is numerically very challenging since, unlike the 
problems coming from physics, the space dimension d may be much higher than 
2 or 3. 

There is a big qualitative jump going from one to more dimensions when 
dealing with convexity constraints. This can be appreciated readily by look- 
ing at the statistics literature, where the one dimensional convex regression or 
density problem has been considered both theoretically and numerically, and 
using different measuring functionals (see, e.g., [TJ [5J HFJ ) , but very little has 
been done numerically in two or more dimensions. In this regard, it is worth 
mentioning the work by Shih, Chen and Kim [18J , who use appropriate splines 
in the MARS (Multivariate Adaptive Regression Splines) algorithm. 

One of the main difficulties in obtaining discrete approximations to optimiza- 
tion problems over convex functions lies in giving a local and finite description 
of convex functions if d > 1 . Though this could be done for smooth functions of 
continuous variables by asking the Hessian matrix to be positive semidefinite at 
all points, there is no similar characterization for discrete functions on meshes. 
As a matter of fact, we show in examples |2.3| and |2.4| that this cannot be done 
easily. 

It goes without saying that there is a lot of work done on convex functions 
of continuous variables, as exemplified by the classical book by Rockafellar [15] . 
Traditionally, this work leans more on properties satisfied by convex functions 
rather than on properties that imply convexity. In particular, very little has 
been done on local properties guaranteeing convexity. 

For discrete variables, there is a rather large work done by the discrete math- 
ematics community for fixed lattices, and a number of definitions for discrete 
convex functions have been proposed (see, e.g., [16]). Again, usually these defi- 
nitions are non-local. 

As far as we know, there exists very few literature on optimization problems 
on convex functions in dimensions two or more, either theoretically or numeri- 
cally. Besides those already mentioned (and the references therein), let us cite 
three more which are prominent in the context of our work. 

• Carlier and Lachand- Robert [5J obtained the C 1 regularity of a variant of 



the monopolist problem, substituting the functional in (1.1 1 for 
(-x |Vu| 2 + x ■ V« - u(xj) f(x) dx, 



under some restrictions on the domain f2 and the density /. They obtained 
also C 1 regularity for convex minimizers of functionals of the form 

(A(x, u)Vu(x) ■ Vu(x) + f(x)u(x)) dx. 



i> 



• 



Carlier, Lachand-Robert and Maury [5] proposed a numerical scheme for 
minimizers of functionals of the type 

j(x, u(x),"Vu(x)) dx, 

on closed convex subsets of convex functions of H 1 ^) or L 2 (Jl), where j 
is a quadratic function of u and Vu, and fi C M 2 . We will discuss their 
approach in the next section (after the inequalities ( |2.5| ). 

As Carlier et al. [B] point out, their work encompasses the problem of 
finding 

min / \u — f\ dx subject to u £ L (f2), u convex, u < /, 
Ju 

for given / e L 2 (fi), i.e., a L 2 -norm projection, and this problem is equiv- 
alent to that of finding the convex envelope /** of /. Thus, minimizing 
over convex functions and finding the convex envelope of a function are 
two quite related tasks. 

• Being one of the main problems in computational geometry, there are 
a number of well established codes for finding the convex hull of a set 
of points in M. d , which are very efficient in low dimensions. Hence, it is 
natural to try to use these codes to approximate optimization problems 
on convex functions, an approach which Lachand-Robert and Oudet |llj 
applied to several problems. 

It would be very interesting to see whether these ideas could be carried 
over, since the convex hull codes are quite fast in low dimensions. 

Our approach, based on semidcfinite programming, takes a different direction 
from those mentioned. 

Let us recall that a semidefinite program is an optimization problem of the 

form 

min c • x 

subject to Cl 21 

x x A x + • • • + x n A n - A h 0, 

xe w 1 , 

where c 6 R™, Aq,A 1: . . . ,A n are symmetric m x m matrices, and A y 
indicates that the symmetric matrix A is positive semidefinite. By letting the 



matrices Ai be diagonal, we see that the program (1.2) is a generalization of 
linear programming (and includes it strictly). Thus, in a semidefinite program 
the constraints can be a mixture of linear inequalities and positive semidefinite 
requirements. 

In this paper we give a theoretical framework for approximating many opti- 
mization problems on convex functions using a finite differences scheme which 



imposes a positive semidefinite constraint on a discretization of the Hessian 
matrix. 

Although not linear, our approach seems very natural and has many advan- 
tages. Being of a local nature, the number of constraints grows only linearly 
with the number of nodes, and it works for any dimension of the underlying 



space. Notwithstanding the already mentioned counterexamples (2.3 and 2.4) 
of the relation between convexity and positive semidefinite Hessian, we will show 
that for many problems we obtain convergence to a continuous optimum. This 
convergence holds even for non-smooth optima, as those arising when projecting 



in the L°° norm or in some problems of the type given by equation (1.1 1. 

In practice, our definition can be used to advantage by using the existing 
efficient semidefinite codes and, furthermore, it is very simple to program in 
higher dimensions. 

As a final remark, we think that our approach might be a first step to deal 
with other problems where convexity is a consequence of the model, such as 
transportation problems where the cost is quadratic, i.e., for solving the Monge- 
Ampere equations, a prime example of fully non-linear differential equations. 

This article is organized as follows. 

In section [2] we summarize some techniques that could be used to deal 
with the numerical approximation of optimal convex functions, showing their 
strengths and weaknesses, and introduce the discrete Hessian. 

In section [3] we give different characterizations of smooth convex functions 
of continuous variables, which allow us to extend the definition of the Hessian 
to C 1 functions in terms of averages. 

The heart of the paper is section [4] in order to have a good definition 
of discrete convexity, we need to show that any convex function of continuous 
variables may be approximated by its discrete counterparts, and conversely, that 
a converging sequence (in a suitable norm) of discrete convex functions will do 
so to a convex function of continuous variables. 

We show in this section that the approximation of the discrete functions to 
the limit u is uniform in u and its derivatives if u is smooth, and uniform in 
u over compact subsets of Cl, if u is merely convex and bounded (and hence 
continuous). Although our definition is stated using the Hessian, if a sequence 
of discrete functions converges to a function which is not C 2 or even C 1 , still 
convexity of the limit function is guaranteed. 

In section [5] we pose a general structure for optimization problems on convex 
functions which fits the results of the previous sections, and may be applied to 
several important problems. In addition, error bounds may be obtained if the 
continuous optimal solution is smooth. 

The results of sections [3j-[5] are somewhat independent of semidefinite pro- 
gramming. However, as explained in section 15] for many problems of interest 
the functional and constraints involved may be expressed as a semidefinite pro- 
gram, and we present in section [6] numerical examples showing that the current 
codes make it feasible to use our scheme on them. 

We begin section [6] by considering the monopolist problem in two and three 



dimensions, comparing our results to the analytical solution when / = 1 in ( 1.1 1 
We continue by showing some norm projections, exploring the behavior un- 
der different functionals of an example given by Carlier, Lachand-Robert and 
Maury [6 . Moreover, we exhibit an explicit example in which the L°° projection 
is not unique. We finish the section on numerical examples by showing how our 



scheme could be used to fit a (discrete) convex function to noisy data given on 
the nodes of a regular mesh, using the L 2 norm (i.e., a least squares approach), 
and also the L 1 and L°° norms, which are not often seen. 



2 Dealing with convex functions 

In this section we present some ideas and techniques that could be used to 
approximate numerically an optimal convex function. For the sake of simplicity, 
we will work on the unit <i-dimensional cube in R d , Q = [0, l] d . 
If d = 1 and 

x = < xx < ■■■ <x n = I, (2.1) 

is a discretization of [0,1], then a convex function u defined on [0,1] satisfies 
the inequalities 

UM-UJXJ-!) ^u( Xi+1 )-u( Xi ) i0Ti=1> ^ tn _ h {22) 

Xi Xi — 1 X-i-\-\ Xi 

Conversely, if a function u defined only in the mesh points {xq, . . . ,x n } 



satisfies the inequalities (2.2 1, we can always extend it linearly in the intervals 
[a;i_i,Xi], i — 1, . . . ,n, so that the resulting piecewise linear function will be 
convex on [0,1]. 



Thus, the inequalities (2.2) may be used to define discrete convexity in one 
dimension. It is clear that — via the piecewise linear extension — we can approxi- 
mate any convex function in [0, 1]. On the other hand, as we show in lemma [2Tl] 



under some adequate assumptions if a sequence of discrete convex functions (sat- 



isfying (2.2 1 for each corresponding subdivision) converges pointwise, then the 
limit defines a unique convex function in [0,1]. 

Let h be given, < h < 1/2, and suppose M. — {xq, ■ ■ ■ , x n } is a subdivision 



of [0, 1] satisfying (2.1l and such that 



max \xi — Xi--\\ < h. (2.3) 

Ki<n 



Let u be defined on M. and satisfy (2.2). For a given e, h < e < 1/2, let us 
consider 

x = max {x G M. : x < e} and x = min {x € M. : x > 1 — e}. 

From (|2.2|) we see that 



\u(x) - u(0)\ u(x) - u(0) u(x i+1 ) - u(xj) u(l) - u(x) \u(l) - u(x)\ 

for all Xi,Xi+i € M. such that x < Xi, Xi+i < x. Therefore, if 

|u(xi)| < M for all Xi G M, 
we have 

C< "(^+1) ~ "(^) <c for all e < Xi , x i+1 < 1 - e, (2.4) 

Xi-\-l Xi 

where C and C are constants depending only on e and M (but not on h or A)). 
In other words, if u is extended as a piecewise linear function to all of [0,1], the 
resulting function is uniformly Lipschitz on compact subsets of (0, 1). 
Hence, 



Lemma 2.1. Let (M.h)h be a sequence of meshes in [0, 1], h J, 0, satisfying ( |2.3| 
for each h, and such that Mh C Mh< f or h > hi . Suppose Uh is defined in Mh 
for each h, and the sequence (uhjh is such that for all x £ UhMh, lim/i W/j(a;) 
exists and is finite, and let 

u(x) = limuh{x) for x £ UhMh- 

h 

Then u may be extended to all of [0, 1] as a continuous convex function in a 
unique way. Moreover, the convergence of Uh (extended piecewise linearly) to u 
is uniform in compact subsets of (0,1). 

Also, by the Arzela-Ascoli theorem, 

Lemma 2.2. Let (Mh)h be a sequence of meshes as in the previous lemma. 
Suppose M > is given, and that for each h a function Uh is defined in Mh so 
that 

\ u h(x)\ < M for all x £ Mh ond all h, 

and assume Uh is extended to all of [0, 1] piecewise linearly. 

Then, there exists a subsequence of {uh') of (uh) converging to a continuous 
convex function u defined on [0, 1]. Moreover, the convergence of the subsequence 
(lih 1 ) to u is uniform in compact subsets of (0,1). 



Thus, from a theoretical point of view, the discrete functions satisfying (2.2 1 
are well understood. 

From the numerical point of view, when used in a discretization of an opti- 



mization problem, the constraints coming from the inequalities (2.2) are linear, 
and solving the resulting discrete optimization problem with them is usually 
not much harder than solving it without them. 

Hence, the case d — 1 poses no major trouble. 

For d > 1 it will be more convenient to work only with regular meshes (or 
grids) on Q = [0, l] d . Thus, for fixed h > (h = 1/n for some n € N), the 
mesh Mh will consist of all points x £ M. d n Q of the form x = hz with z £ 7L d . 
Denoting by Q — Q \ dQ the interior of Q, we set 

M h = M h n Q, dM h = M h n dQ, 
and denote by Uh the set of real valued functions defined an. Mh- 



A first simple idea to extend the inequalities (2.2 1 to more dimensions, is to 



consider the set of functions Uh £ Uh satisfying the convexity constraints 

Uh(x + hei) + Uh(x — hei) > 2 Uh(x) for all x £ Mh and i = 1, . . . , d, 

where e* denotes the i-th vector of the canonical basis of R d . 

This set of discrete functions is very appealing because the convexity is 
modelled by using only 0(number of mesh points) linear constraints, as in the 
case d = 1. As we have seen, this approach is exact in one dimension, and gives 
satisfactory results in many cases in more dimensions (in particular for some 



of the specific examples treated later in section 6.1 1, but there is no guarantee 
of convexity in the limit function if d > 1. In fact, by taking the interpolant, 
with this set we can certainly approximate any function of continuous variables 
which is convex, but we can also approximate other functions. For example, we 



may approximate u(x\,X2) = x\Xi which is linear and thus convex — in each 
coordinate direction, but not convex in Q C I 2 . 

This shows that the definition of discrete convexity should be done with 
more care: when d > 1, we have to take into account every possible direction. 

It is reasonable, then, to say that a discrete function is convex if 

Uh{x + y) + Uh(x — y) > 2uh(x) for all x, y such that x ± y £ Mh- (2-5) 

Since as h goes to the possible directions become dense, it is possible (under 
some conditions) to regain convexity in the limit as we had for one dimension. 
In a way this is the approach followed by Carrier, Lachand-Robert and 
Maury [BJ. In a two dimensional setting, they consider discrete convex functions 
which are restrictions to a mesh of convex functions of continuous variables, and 
show that this definition is equivalent to an intrinsic one, stated only in terms of 



the value of the function at the grid points, similar to (2.5 1. The problem with 
this description is that it is non-local, and the number of constraints needed in 
two dimensions (after pruning) reportedly grows approximately as iV 18 , where 
N is the number of nodes in the grid. Moreover, this approach is very difficult 
to extend to higher dimensions. 

In order to keep the definition of discrete convexity local, another possibility 
is to consider discretizations of the Hessian matrix, as we do in this work. 

For u £ lAh and i£Mf,,we define the (forward) first order finite differences 

by 

u(x + hei) -u(x) . 

A hti u{x)= , i = l,...,d, 

and the second order finite differences by 

iir 

or, if i ^ j, by 



■2 u(x + h&i) — 2u(x) + u(x — hei) 



4/i 2 



u(x + hei — hej) + u(x — he,i — hej)). 



Clearly, not all of these finite differences are defined for points in dM.y L . 
Therefore, when mentioning Ahju(x) or A hi ju(x) for x £ Aih, we will im- 
plicitly assume that x and i (and eventually j) are such that the corresponding 
finite difference is well defined at x. 

Finally, we define the discrete Hessian of u £ Uh at x £ Mh, as the symmetric 
matrix Hhu(x) £ 'R dxd whose i, j entry is A^ 4 - u(x). 

We are faced now with two issues: 

1. How can a discrete optimization problem with positive semidefinite dis- 
crete Hessian constraints be solved? 

2. Once we found a discrete solution, how does it approximate the continuous 
solution? 

As mentioned in the introduction, for the first issue we will use the semidefi- 



nite programming model ( 1.2 1, where the objective is linear and the constraints 



are positive semidefinitc. Thus, if the objective of the continuous problem is 
not linear, we must rewrite it as a constraint. However, it is worth mentioning 
that not every functional may be readily modelled as a positive scmidcfinitc 
constraint, an example of such functionals being 

dx, 



fi 1 + IVmI 2 



arising in Newton's problem of minimal resistance. 

Although this method may not be used for all functionals, it can be used in 
many cases as we now illustrate. The interested reader is referred to the article 
by Vandenberghe and Boyd [20] for other possibilities. 

In what follows we associate with each mesh node P&, k = 1, . . -,N, the 
unknown value u k , the discrete Hessian Hh(P k ), and a d-dimcnsional cube Q k 
centered at P k of measure \Qk\, so that J^ k \Qk\ = \Q\- 

L 1 projection. The program 



/ \u — f\ dx = min N^ / \ u — f\ dx, 



mm 

lQ " JQk 



subject to u convex, is modeled by adding N more unknowns t\, 
(for a total of 2N) as 



'■JV 



in^tfe 



mm 

k 
subject to 



\u-f\dx<t kl k=l,...,N, 
H h (P k ) b0, k=l,...,N. 



In turn, the constraints of the form 



/ \u- f\dx <t k , 
JQk 



arc lumped as \u k — /(-Pfe)| \Qk\ < tk, and finally modeled as the linear 
inequalities 

-u k \Qk\ + t k >-f{P k )\Qk\, 
u k \Qk\ + t k >f(P k )\Q k \. 



2 projection. This is analogous to the L 1 projection: 




min ^2 tk 

k 




subject to 




(u k -f k ) 2 \Qk\<t k , k = l,. 


.,N 


H h (P k ) h0, k = l,. 


.,N 



and the constraints of the form (u k — fk) 2 \Qk\ < tk are written in the 
form 

tk {xk - fk) s/\Qk\ 

{x k - fk) V\Qkl i 

The H 1 projection is handled in a similar way. 



^0. 



L°° projection. This is analogous to the other projections, except that we only 
need to add one more variable t (instead of N), obtaining the discrete 
program 



min t 



u k + t>~f(P k ), 


k = 1,. 


.,N 


u k + t>f(P k ), 


fe = l,. 


.,N 


h(P k ) h o, 


fc = l,. 


.,N 



subject to 



The next issue is to see how well the discrete solutions of the semidefinitc 
program approximate the continuous convex solution. 

Our first hope is that Hhu(x) >; will be equivalent to some version of 
convexity of discrete functions, for example to the one given by the inequali- 



ties (2.51. The next examples show that this is not true. The first one shows that 



a non-convex function may have a positive semidcfinite discrete Hessian, and 
the second one shows that the discrete Hessian may not be positive semidefinitc 
for some convex functions. 

Example 2.3. Let us consider d = 2, and u defined on the 2-dimensional 2x2 
grid Mi/2 by 



u(0,l) = l, u(l/2, 1) = 1, u(l,l) 

u(0, 1/2) = 1/2, u(l/2, 1/2) = 5/8, w(l,l/2) 

u(0,0)=0, u(l/2,0) = 1/2, u(l,0) 



We may check that (with h = 1/2), 
H h u{l/2, 1/2) 



1 -1 
-1 1 



whose eigenvalues are 2 and (with eigenvectors (1,-1) and (1,1)), and is thus 
positive semidefinitc. 

However, the restriction to the diagonal x\ — X2 is not convex: 



u(0,0) + u(l,l) 



~<!=«(l/2,l/2). 







Example 2.4. Consider the same grid of the previous example and u defined by 

u(0,l) = 8/15, u(l/2, 1) = 8/15, u(l,l) = l, 

u(0, 1/2) = 1/30, u(l/2, 1/2) = 1/2, u(l,l/2) = l, 

w(0, 0) = 0, u(l/2, 0) = 1/2, u(l, 0) = 1. 

We see that u is the restriction to M.h of the convex function 



14ici 



max \ x\, 



x 2 



15 



but 



H h u(l/2, 1/2) = 



15 



-,x 2 - 

2 
-23 



-23' 

2 



which has eigenvalues 5/3 and —7/5, and hence is not positive semidefinite. 



Notwithstanding these examples, in section [4] we will show that, under suit- 
able conditions, by asking for positive semidcfinite discrete Hessians we may 
conveniently approximate continuous convex functions, and obtain variants of 
lemmas ED and E2l 

However, we will need some previous results which are tackled in the follow- 
ing section. 

3 Convex functions of continuous variables 

In this section we deal with variants of the Hessian matrix, initially defined for 
C 2 functions, which allow us to characterize convexity for C 1 functions locally. 
Let us start by introducing some more precise notation: 

• The distance from a point a; to a set A will be denoted by dist(x, A). 

• If x £ R d and h > 0, Q h {x) = {y £ R d : \y 4 - x,\ < h for alH = 1, . . . , d}. 

• The gradient of u is denoted by Vu = {d^u, . . . , d d u), and we write 9? for 

aft. 

• If £1 C M. d is an open set, we denote by C k (Cl) the set of functions having 
all derivatives up to order k continuous on ft, and by C k (Q) the set of 
functions having all derivatives up to order k uniformly continuous in CI. 
If Q is omitted, C k = C k {Q). 

Let ip be a non-negative, real valued function in C^°, vanishing outside 
{x £ M. d : \x\ < 1}, and such that J Rd ip(x)dx = 1, and for e > define 
(f e (x) — e~ d <p(e~~ l x). The function 



u e (x) =m* tp e (x) = / u(x-y)<p e (y)dy, (3.1) 

defined for functions u and points x whenever the right hand side makes sense, 
has many interesting well known properties, and we state some of them without 
proof, referring the reader to, e.g., the book by Ziemer [22, Theorem 1.6.1 and 
Remark 1.6.2]. 

Theorem 3.1. 

1. Ifu £ L} oc (R d ), then for every e > 0, u e £ C°°(IR d ) andd a u e = (d a <p e )*u, 
for every multi-index a, where for a — (aj., . . . , ay)> ® a ~ &1 1 ■ ■ ■ ^d d • 

2. If u £ L l (Jl) then u £ (x) is defined for x £ il and dist(x,dfl) > e. 

3. lim e ^o u £ (x) = u(x) whenever x is a Lebesgue point of u. 

4- If u £ C(0), and F is a compact subset of CI, then u E converges uniformly 
to u on F as e — > 0. 

The functions u e , also called regularizations or mollifiers of u, will make us 
work often with the sets 

A e — {x £ Q : dist(x, dQ) > e}, 
10 



where e G M, < e < 1. 

Our first result is a simple characterization of continuous convex functions 
using the regularizations u e , and we omit its proof. 



Lemma 3.2. Suppose u G C and u e is defined as in (3.1). We have, 



1. If u is convex, then u e is convex in A e for all e > 0. 

2. Conversely, if for a sequence of e's converging to 0, u e is convex in A e , 
then u is convex. 

The Hessian of u G C 2 at x is the symmetric matrix Hu{x) whose ij en- 
try is dfju(x), and convex functions in C 2 are characterized by the positive 
semidefinite condition 

Hu{x) y for all x <E Q. 

Also, for u G C 2 , and 5 > 0, the average 

W ' u W = 7*kd I H u(v)dy, (3.2) 

(2d) d J Qs{x) 

defined for x G As, converges to Hu(x) for all x G Q, as S J. 0. Thus the 
condition 

Hsu(x) >z for all x and <5 such that x G A5, (3-3) 

implies the convexity of u. Actually, since the average of positive semidefinite 



matrices is positive semidefinite, (3.3 1 is equivalent to convexity for u € C . 

The definitions of Hu and Hsu involve second order derivatives which are 
not defined if u is not smooth enough. However, we may express Hsu merely 
in terms of u and Vu integrating by parts, and the resulting formula will make 
sense for u € C 1 . Thus, for u G C 1 and x G As let us define Hsu(x) as the 
symmetric d x d matrix whose diagonal entries are 

(H s u(x)).. = (2S)- d ( / d t u(v) dy- f d iU (y) dp) , (3.4a) 

and, if i ^ j, 

(Hsu(x)) = (2S)- d ( / u(y) dy - f u{y) dy 

u(y)dy+ / u(y)di/), (3.4b) 

^Ti(^)nF 5 +(x) J^.Wn^W ' 

where FrAx) denotes the d — 1 dimensional face of the cube Qs(x) having 
outward normal ±ej. 

Of course, since the equations (3.4 1 were obtained integrating by parts, we 
have: 

Lemma 3.3. If u G C 2 then Hsu — Hsu. 

As we show now, the extension Hs of Hs to functions in C 1 , still gives a 
local characterization of convexity. 



11 



Theorem 3.4. The following are valid if u £ C 1 : 

1. If u is convex, then Hsu(x) y for all S > and x £ Ag. 

2. If for a sequence S n J, 0, Hg n u(x) y for all x £ Ag n , then u is convex. 



Proof. Suppose u £ C , and consider the regularization u £ denned in (3.1 1. 
Since we can interchange derivatives and integrals with the convolution, 

Hgu e (x) — Hs(u * (p £ )(x) — (Hgu) * <f E (x) for x £ Ag +e . (3.5) 

If u is convex, by the first part of lemma [3T2| we know that u s is convex in A F , 



and since u e £ C°°, for x £ A$ +e we have H$u e (x) y 0. Thus, by lemma 

H s u £ (x) y for x £ As +£ . 



3.3 



Using theorem 3.1 since u e and their derivatives converge uniformly to u in 
compact subsets of Q, we must have Hsu(x) y for x in compact subsets of 
As, and therefore in the whole of A$. 

On the other hand, if H$u(x) y for x £ Ag, since an integral mean of 



positive scmidcfinitc matrices is positive scmidcfinitc, using ( 3.5 ) and lemma 3.3 
we see that 

H s u £ (x) = H s u e (x) y for x £ A e+S , 

which implies that u £ is convex in A e+ $. Letting S go to while keeping e fixed, 



we sec that u e is convex in A e , and by the second part of lemma 3.2 u must be 



convex. □ 

4 Approximating convex functions, the discrete 
Hessian 

There are two main issues when defining the set of discrete approximants to be 
used: 

1. we want it to be rich enough to approximate every convex function, and 

2. we want this set to be not too large, to avoid convergence to non-convex 
functions. 

The first point is very natural, and necessary to approximate the solution 
to the problem. The second point might look artificial at first sight, but if not 
enforced, then we could be approximating a non-convex function. This is the 
case, for instance, if we only require convexity along the coordinate axes, as we 
stated in section |2] 

Our discrete approximations will be a subset of functions having positive 
semidefinite discrete Hessians, and the main purpose of this section is to address 
the two issues mentioned above. 



We will show in theorem 4.2 that, despite the examples 2.3 and 2.4 the 
discrete Hessian Hh may be used to obtain very good approximations to convex 
functions of continuous variables. 

This is not surprising since we can approximate, say, u £ C 3 , by discrete 
functions whose finite differences up to order 2 converge to the derivatives up 
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to order 2 of u, and then add a small perturbation to bring up the eigenvalues 
of the discrete Hessians. This is the main idea of the proof, and in its course, 
we will use the following part of the Hoffman- Wiclandt theorem [10] . 

Theorem 4.1 (Hoffman and Wielandt). There exists a positive constant Cd, 
depending only on the dimension d, such that if A = [ay] and B = [bij] are 
symmetric d x d matrices, and X and /j, are their minimum eigenvalues, then 

I A - fi\ < c d max \a, LJ - b i2 \. 
y 

Theorem 4.2. Let u € C 3 be convex. Then, for any e > 0, there exists 
ho = ho(u,e) > such that for any h, < h < ho, there exists a function Uh 
defined on M.^ satisfying for all x G M-h, 



\u h (x) -u(x)\ +J~]\Ah,jUh(x) - d t u(x)\ 



£ \Al. j u h (x)-df j u(x)\<e, (4.1) 

l<i<j<d 



and 

HhUh(x) y for all x € Mh- 



Let us recall that in the inequality (4.1 ), for x G dM.t we consider only the 



finite differences that are defined at that x. 

Proof. If u is convex and smooth, given e\ > there exists S\ > such that for 
all x € Q and < h < Si, 

d 

Y,\&h, l u(x)-d l u(x)\+ J2 l A Mi«(*)-4-«( a: )l < ^' 



where Cd is the constant in theorem 4.1 Hence, since u is convex and therefore 



Hu(x) >z 0, the minimum eigenvalue of Hhu(x) is uniformly bounded below by 

-£l- 

Now consider the function 

SO) = 2 l^l 2 ' 

for which, if h small enough, 

\g(x)\<d/2, \A fl!i g(x)\=\x i + h/2\<2 ) and H h g(x) = Hg(x) = I d , 

where Id is the identity matrix in IR dxd . 
If < h < min{ei, S\}, the function 

u h = u + eig, 

defined onM/,, satisfies 

H h u h (x) y for x e M hl 

\u h {x)-u(x)\ < ~y, 
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and 



J2\&h,iU h (x) - d iU (x)\ + J2 \^l,H u h{x)-d%u{x)\ 

i=l l<i<3<d 

d d 

< Y, \A Kl (u h (x) - u(x))\ + Y, l A M u(x) - d iU (x)\ 

+ E \^hj( u h(x)-u(x))\+ z \^l,i3< x )- d K x )\ 

l<i<j<d l<i<j<d 

< 2de x +de x + — . 

Cd 



Thus, for some constant c' d depending only on d, the inequality (4.1 1 holds 
with e replaced by c' d £\. 

The result follows now by taking E\ and h appropriately. D 

The main implication of theorem |4.2| is that any smooth convex function is 
a limit of a sequence of functions with positive semidcfinitc discrete Hessians, 
giving an affirmative answer to the first issue mentioned at the beginning of 



this section. Moreover, an application of lemma 3.2 implies this result also for 
non-smooth convex functions, with convergence in || • ||o on compact subsets of 
VI (see definitions below). 

We would like to show next that our set of approximants also solves the 
second issue. That is, if we have a convergent (in certain norm) sequence of 
functions with positive semidefinite discrete Hessians, then the limit is convex. 
On the other hand, if the sequence is not convergent, we would like also to 
understand under which further conditions on the sequence we may extract a 
subsequence converging to a convex function. 

In what follows, we will work with sequences of functions defined on finer 
and finer meshes, that is, sequences S — (w/ i7l ) neN with Uh n € Uh n for every n, 
such that h n /h n+ i <G N and (h n ) n decreases to 0. We will denote by S the set 
of all such sequences, and use the notation 

M(S) = U n M hn , M(S)=M{S)nQ, and dM(S) = M(S) n dQ. 

So as not to clutter even more the notation, usually we will drop the index n, 
writing S = (uh) for 5e5, when this does not lead to confusion. 
If S = (uh) € S and u : Q — ► R, we will say that 

lim Uh{x) — u(x) uniformly for x € M(S), 

if for any e > 0, there exists ho = h (e) > 0, so that \uh(x) — u(x)\ < £ for all 
h < ho and x € M.h, that is, max xeJ vf h \ u h(x) — u(x)\ — > as h — > 0. In this 
case we will write, with a little abuse of notation, 

lim \\uh - u\\* n = 0. 

If in addition mgC 1 , we write, similarly, 

lim \\uh - u||, = 0, 
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to indicate that lim/ w o \\ u h — u\\ = and 

lim A hi u h (x) = d t u{x) 

h->0 

uniformly for all x G M(S) for which the finite differences make sense and 
i = 1, . . . ,d. 

Lemma 4.3. Suppose S — (u/j) € S and u G C 1 are such that 

lim \\u h - u\\l = 0, 

and 

H h Uh(x) y for all x G M(S). 

Then u is convex. 

Proof. We follow closely what was done in section [3] using a variant of the 
divergence theorem for discrete variables, so that only approximations to the 
function or its derivatives — but not the second derivatives — are needed. 
For x G Aih' an d h <C hf, let us define 



H w.h uh(x) = ^H h u h {y), 



where the sum is over all y € M.h such that Qh(y) C Qh'(x). 

Since the sum of positive semidefinite matrices is a positive semidcfinitc 
matrix, we have 

H* h , h u h (x) >z for all x G Mw ■ 

In one dimension we have just one entry in H^, h , involving a term of the 
form 

u(x -h')-2 u(x - b! + h) + u(x - h' + 2h) 

+ u(x - hf + h) - 2 u(x - h' + 2h) + u(x - hi + 3/i) H 

+ u(x + h! - 2h) - 2 u(x + ti - h) + u(x + h') 

= (u(x + h') - u(x + h! - h)) - (u(x -h' + h) - u(x -h')). 

That is, if d = 1 then 

H* hlJl u h (x) = [i (A,, u h (x + h'-h)- A h u h (x - h'))] . (4.2) 

If d = 2 and x = (#i,a;2), the diagonal entries are similar to the one dimen- 
sional case. For instance, 



H^^u^x))^ = - [S^ls. h ^u h (xi + ti - h,x 2 + kh) 



k 



-J2 A h,iu h (xi-ti,x 2 + kh)y (4.3) 
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where both sums are on k's such that —h'<kh<h'. On the other hand, the 
off-diagonal terms are of the form 

(H* h , h u h (x)) 12 = — ((uh(x + a) +Uh{x + 0) + u h (x + j) + u h (x + S)) 

- (uh(x + a') + Uh(x + /?') + u h (x + 7') + u h (x + 5')) 

- {u h (x - a 1 ) + u h (x - (3 1 ) + u h {x - 7') + u h (x - 5')) 

+ (u h (x - a) + u h (x - 0) + u h (x - 7) + u h (x - <5)) J , (4.4) 
where 

a= (ti - ft) ei + (ti - h) e 2 , a' = (ti - h) ei - (/i' - ft) e 2 , 

/3 = /i' ei + (ft' - ft) e 2 , 13' = ft' e x - (ft - ft') e 2 , 

7 = (ft' — ft) ei + ft' e 2 , 7' = (ft' — ft) ei — ft' e 2) 

5 = ft'ei + ft' e 2 , <5' = ft' ei — ft' e 2 . 



For d > 2 we get similar expressions to those of equations (4.2 1 and (4.3 1 



for the diagonal terms, and (4.4 1 for the off-diagonal terms, except that — as 
when going from (4.2 1 to (4.3 1 — they must be summed over mesh points on 
d — 1 dimensional surfaces perpendicular to the (say) i-th direction, or d — 2 
dimensional surfaces perpendicular to the (say) ij plane. 

This was also the case in the equations ( |3.4| ) for the continuous case, and so 
we can think of these sums as approximations to those integrals, where small 
d-dimcnsional cubes of side length ft centered at mesh points have been used. In 
order to do this we must multiply the entries by h d so as to obtain the correct 
dimensions. 

In fact, since we are assuming 

lim \\u h - w||i = 0, 

h-*0 

and H^, ^u^(x) involves either no finite differences (off the diagonal) or only first 
order differences (in the diagonal) which converge uniformly to (respectively) u 
or its first derivatives, 

lim ft iJ,*/ h Uh{x) = Rh'u(x) for all x S Mh', 

where 

Rh'u(x) t 0, 

since the limit of positive semidefinite matrices is positive semidefinitc. 

On the other hand, due to the convergence of Uh to u in the || • ||* norm, we 
can see that, as ft j 0, 

(h d H* h , ih u h (x))..^ f d iU (y)dy- f d lU (y)dy, 

(h d H* h>Jl u h (x)) - ( f u(y) dy - f u(y) dy 



u{y)dy+ I u(y)dy), 

JF-^nF-^x) > 



1G 



if i =/= j- Thus, 

R h >u(x) = {2ti) d H h ,u{x) for x £ My, 

which implies H^u^x) >z for all x £ M-k' ■ The result now follows from 
theorem 13.41 □ 



The convergence in || -||* should be guaranteed by the definition of the discrete 
problem and is problem dependent. We make now a general assumption on an 
algorithm to ensure convergence in [J - J J -^ . 

Let us denote by A 1 the space of Lipschitz continuous functions defined 
on Q, recalling that A 1 = W 1 ' 00 , the space of continuous functions having 
derivatives in the weak sense up to order one bounded. If K > 0, we let A^ 
be the set of functions u £ A 1 such that both \u(x)\ < K for all x £ Q and 
\u(x) — u(x + t&i)\ < Kt for all z = 1, . . . , d and t > whenever x, x + £e; £ Q. 
Of course, the condition \u(x) — u(x + te»)| < Kt is equivalent to |9jtt(a;)| < K. 

Similarly, let us denote by A 2 = W 2 '°° the space of functions whose first 
derivatives are in A 1 , and by K 2 K the set of all functions in A 2 which have all 
weak derivatives up to order 2 bounded by K. In particular, we have A 2 C C . 

By analogy to the continuous case, let us define for h and K positive the 
following spaces of discrete functions: 

A °h,K = { u ^Uh ■ \u(x)\ <K,x£M h }, 
A h,K = i u e A° h ,K '■ |4,i«W! <K,x€M h ,i= l,...,d}, (4.5) 

Aft jif = {u £ Ai iK : \A 2 Kij u(x)\ <K,x€M h ,i,j=l,...,d}, 

with the understanding that on the right hand sides we take A^^ u(x) or A h ^ 
for all x,i and j where it makes sense. 

The following result is a version of the Arzela-Ascoli theorem, and we omit 
its proof. 

Theorem 4.4. If S = (uh) £ S for which Uh £ A\ K for all h > 0, then there 
exists u £ Ajf and a subsequence S' = (uh') of S such that 

lim \\u h > - u\\l = 0. 
h' — *0 

Combining the previous results, we have: 
Theorem 4.5. If S — (uh) £ S is such that 

u h G A-l >K and H h u h (x) > 

for all x £ A4(S) and h > 0, then there exists u £ K 2 K , and a subsequence 
S' = (iih 1 ) of S such that 

lim \\uh> —wlli =0 and u is convex. 



Proof. Applying theorem 4.4 to the functions A^, perhaps on some smaller 
meshes M.\ h with M h c M.' ih C Mh, we may find functions m £ A^- and a 
subsequence of S, S 1 = (uh 1 ), such that 

lim \\A h ' ti Uh' - Ui\\l = for i = l,..., d 
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To show that there exists u such that Ui = c^w (in the classical sense), we 
define for x = (x\, . . . , Xa) £ Q, 

d d rX ■ j— 1 

i=i j=i j° i=i 

Since for i = 1, . . . , d, Uj is bounded (by -ftf) and continuous, u is well defined 
and continuous. Using that Ah'jUh' converge uniformly to Uj, for a; G Ai(S') 
we may write 

d Xj/h' j-l 

u{x) = lim Uh'(0) + ^2 5Z ^' ^ h ' '•?' Uh ' (^^' e J +y"^ e i 

j"=l fej=0 i=l 

But, for a given ft' and all x € M.h> we have 

rf Xj/h' j-l 

u/i' (0) + X! X! h ' ^ h ''i Uh ' \ k J h/ e i + X! Xl e ' 

j=l kj=0 i=l 

i J 

j — 1 i= 1 i— 1 



(0) + ^ f Mfe' ( ^ a?t ej j - uw (y~] Xj eA j =Uh'{x), (4.6) 



and therefore 



lim \\uh> - u||n = 0. 
w — >0 



Arguing as in equation (4.6 1 and taking limits, we may also verify the "in- 
dependence of path" , that is, for x £ M.y and ft' < h' we may write, 

f' 1 ' 
u(x + ft e.i) — u{x) = / Ui(x + tt ej) di; for all i = 1, . . . , d. 

Jo 

Using the continuity of u, we see that the last equation is valid for all x £ Q 
and ft' small enough, and, using once more the continuity of u, that 

f s 

u[x + 5 ei) — u[x) = I Ui(x + ti ej) dti for all i = 1, . . . , d, 



■-/<) 
for all x £ Q and all 6 > 0, and therefore 

d i u{x) = tij(x) for all x £ Q and i = 1, . . . , d. 



The convexity of u follows from lemma 4.3 □ 



Remark 4.6. If m„ is a sequence of convex functions converging pointwise to 
u, then u must be convex. However it is not true in general that the second 
derivatives or the Hessian of u n will converge to that of u, even if we have 
uniform convergence of u n and its derivatives to those of u. In this sense, 
theorem 14.51 cannot be bettered too much. 

For example, consider in d = 1 the functions 

x 2 
u{x) = —, 
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4cos(7mx) 



and, with h — 1/n for n G N, 

. . . . cosiimx) 

U h {X) = U{X) + — "2—. 

(nir) 
We have, 

Uh{x) > 0, lim ft _o Uh(x) = u(x), 

sin(7mir) 
< u'fAx) =u'{x) < 1, lim/ l _ u h (x) = u (x), 

WK 

where the limits are uniform in x. Also, 

Uh(x) — u"(x) — cos^imx) (> 0), and &*Uh(x) = 1 - 

vi - 

Taking n = 2 J+m , x = ~ = ^"^, for m > we obtain 

.9 , x 4cos(vr2 m i) 4 

AWz) = 1 ^ - = 1 - — w 0.594715, 

and so the second order differences at dyadic points converge to this constant 
value. However, also at these points, u"(x) = 1. 

On the other hand, we may weaken the conditions on convergence, following 
what was done for the one-dimensional case in section [2j 

If Uh is defined in Aih and HhUh(x) >z for x € M-hi its restriction to 
the one-dimensional line obtained by fixing all coordinates except the i-th one, 
satisfies 

u{x + ihei) — u(x + {i — 1) hei) u(x + (i + 1) Tie,-) — u(x + i/ie,-) 
ft - ft ' 

for i — 1, . . . , d, since these arc coefficients in the main diagonal of H^u^x+hei). 



That is, the inequalities (2.2 1 are satisfied, and hence if for some M > 0, 



|u/i(a:)| < M for all x G 7W/ l; 

then for any given e > we may find a constant C, depending on M and e but 
not on u, such that 

|A/j^ m(x)| < C for all a; G .M/i such that dist(x,9(5) > e. 



Therefore, by applying theorem 4.4 to d-dimcnsional cubes contained in Q, 



we may strengthen lemma 4.3 to obtain a variant of lemma 2.1 
Corollary 4.7. Let S = (uh) G S such that 

H h u h (x) >: for all x G M(S), 
and suppose u is a (finite) function defined on M(S) satisfying 
limuh(x) = u{x) for all x G A4(S). 

Then, the convergence of Uh to u is uniform on compact subsets of Q, and u 
may be uniquely extended to a convex function defined on all ofQ. 
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Similarly, we may weaken the conditions of theorem |4.4| to obtain the fol- 



lowing version of lemma 2.2 



Corollary 4.8. If M > and S = (v,h) £ S are such that 

\uh{x)\ < M for all h and x £ Mh- 

Then there exists a continuous convex function u defined on Q and a subsequence 
S' = (lift/) of S such that 

lim Uh'(x) = Uhix) for all x £ Ai(S'). 

fc'->0 

Moreover, the convergence is uniform on compact subsets of Q. 

We conclude this section with some comments. 

Remark 4.9 (boundary behavior). If u £ C 2 , dfju(x) is defined initially for 

x £ Q, but the very definition of C 2 as the set of those functions in C 2 (Q) 
having second order derivatives uniformly continuous on Q, makes it possible to 
define df^u(x) for x £ dQ by a limiting argument. And the same goes for lower 
order derivatives, and even Hu{x). 

The situation is different for discrete functions defined onMd for fixed h > 0, 
since we cannot take limits. However, the particular geometry of Q makes it 
possible (as we did) to consider for x £ dM.h as many finite differences as we 
can. For instance, A 2 ^ u(x) is well defined for x £ dM.h as long as x±hei £ M.h- 

Pushing the definitions a little further, we may define the discrete Hessian 
Hhu(x) for x £ dMh, by including as many second derivatives as we can. For 
example, if d = 3 and h = 1/2, we can define 



H h u(x) 



A Mi«0) A M2"(£) 



if x= (1/2,1/2,0), 



A M2«W A 1,22U(X) 

H h u(x) = [A 2 M1 u(x)} if x = (1/2, 0, 0), 

leaving Hhu(x) undefined if x = (0, 0, 0). 

As the reader may verify, our previous results involving Hh remain valid 
with this interpretation of the discrete Hessian. 

5 Approximating functionals 

We are in position now to use finite difference approximations of a wide class of 
optimization problems on convex functions. 

Let us describe this technique by assuming, for instance, that (V, \\-\\ v ) is a 
Banach space of real valued functions on Q, the functional 

J(v)= / F(x,v(x),Wv(x))dx 
JQ 

is defined and continuous on V, and we are interested in the optimization prob- 
lem 

inf{J(w) : v£C}, (5.1) 
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where C is a family of convex functions, C C V . 

If the functions in C may be approximated by convex functions in C 3 (IV, 



then using theorem 4.2 it may be not too difficult to define for each ft, > (or 
a sequence converging to of such /i's), a family Ch, Ch C hih, and a functional 
Jh defined on Ch, such that: 

1. HhVh(x) >z for all Vh G C^ and x G .M/i, 

2. for any v £ C and any e > 0, there exists h > and % G C/j such that 
\Jh(v h ) ~ J(v)\ <£. 



Condition 2 immediately implies that 



(5.2) 



in£{J(v) : v G C} > inf inf {J/j(^) : % G C h }. 

To prove the converse, we observe that 

inf {Jh(vh) ■ v h G C h } = inf hxi{J h (v h ) : v h G C h , K }, 

h h,K>0 

where Chj< = Ch C\ A h K . Keeping K > fixed, we may find a sequence S 
(ujf), with u^ G Ch.K such that 

J h{uh) I inf inf {Jfc(v/,) : v h G C^^}, 



,A" 



and using theorem 4.5 a subsequence S' = (m^/) and a convex function u G C 
with 



u IN converging to 0. 



If Vh and J/j are such that 

/eC and J h {uh) -» J(w K ), 
letting if-* oo we will have 

inf {J(v) : u G C} < inf ini{J h (v h ) : v h G C^}. 

h 



(5.3) 



proximations of the problem (5.1 1 



Putting together the inequalities (5.2 1 and (5.3 1, we will have discrete ap- 



Let us give some more concrete examples. Suppose, for instance that V = H 1 
is the set of functions u : Q — > R with finite norm 



\Wu(x)\ 2 + \u(x)\ 2 dx) 



1/2 



and suppose C is the set of all convex functions in H 1 . Given / G V we would 
like to find its projection on C, that is, find u G C such that 

\\u- f\\ =min||u-/||, 

vEC 

and, getting rid of the square roots, we may set 

J(v) = ]\v-ff. 

In this example we actually have a unique minimum, since the norm is strictly 
convex. We may consider then Ch as the set of discrete functions Vh G Uh with 
HhVh(x) y for all x G Mh- 
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Assuming for simplicity that f E C 1 , for Vh € Uh we define 

d 
4K) = h d ( £ (M*) - /(s)| 2 + X> M «/,(*) - 9,/( 



*)| 2 ^ 



KGWlh 



Then, it is easy to see that ( 5.2 1 and ( 5.3 1 hold. In fact, the convex functions 
of V may be approximated by C 6 convex functions, so that, given e > there 
exists v G C 3 n H 1 such that 

ii 1 
\\ u -v\\ < 2 £ ' 



and use theorem 4.2 to find h small enough and V] x G Ch such that 



\Jh(v h ) - J(v)\ <-e. 
Thus we will have 

| AM - J{u)\ < \J h {v h ) - J{v)\ + \J{v) - J{u)\ < s, 
for some Vh G Ch- 

6 Numerical results 

In this section we illustrate the behavior of the numerical scheme by applying it 
to the problems mentioned in the introduction, namely, the monopolist problem, 
norm projections on the set of convex functions, and fitness of data by discrete 
convex functions. 

In all of the following examples, we associate with each mesh node P&, k = 
1,...,N, the unknown value Uk, and the square Qk — Q(~)Qh(Pk)> of area \Qk\- 
We also consider the discrete Hessian Hh(Pk) as discussed in the remark [4~9J 
i.e., imposing convex constraints on the boundary whenever they make sense. 



Even though theorem 4.5 requires us to impose upper bounds on the second 
order differences in order to ensure convergence, the following numerical exper- 
iments were carried on without this requirement. Convergence was nevertheless 
observed at optimal rates. 

The times reported correspond to the experiments being run on a PC, with a 
2.8GHz Pentium IV processor and 2GB of RAM, running Linux. The matrices 
were assembled using OCTAVE [7] and the semidefinite program was solved 
using CSDP 5 2 with the default parameters. The graphics were obtained 
using Mathematica [2"T] . 

6.1 The monopolist problem 

Since this problem is not widely known in the mathematics community, let us 
start by giving a brief description of it following the one given in [14] , where it is 
referred to as the revenue maximization in a multiple-good monopoly problem. 

Problem 6.1 (The monopolist problem). A seller with d different objects faces 
a single buyer, whose preferences over consumption and money transfers are 
given by U(x,p,t) — x ■ p — t, where x G [0,l] d is the vector of the buyer's 
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valuations, p £ M. d is the vector of quantities consumed for each good, and 
t € K is the monetary transfer made to the seller. The valuations x are only 
observed by the buyer, and a density function f(x) represents the seller's belief 
on the buyer's private information x. The seller's problem is to design a revenue 
maximizing mechanism to carry out the sale, and it is enough to consider only 
direct revelation mechanisms: the buyer must prefer to reveal its information 
truthfully (incentive compatibility) and to participate voluntarily (individual 
rationality). Under these conditions, it can be proved that the seller's problem 
may be written as 



max 



Q 



(\7u(x) ■ x — u(x)) f(x) dx, 



where Q = [0, l] d , f is a non-negative probability density function over Q, and 
C is the set of functions u satisfying 

1. u is convex, 

2. < Vu(i) < 1 for all x e Q (the gradient taken in the weak sense and 
the inequalities componentwise), and 

3. u(0) =0. 

The functional to be maximized is the seller's expected revenue. For a buyer 
of type x, the solution u to this optimization problem represents the utility re- 
ceived by her, and the i-th component of Vw denotes the probability that she will 
obtain good i. The restriction of convexity stems from incentive compatibility, 
and the condition u > from individual rationality. 

We show now some numerical results for the problem |6.1| for the special case 
/ = 1, for which analytic solutions in 2 and 3 dimensions are known, allowing 
us to judge the behavior of the discrete approximations. That is, the functional 
to be minimized is 

J{u) = / (u(x) — Vm(i) • x) dx. 

Jq 

In both 2 and 3 dimensions, the solutions are piecewise linear convex func- 
tions whose partial derivatives are either or 1. For example, for d — 2 the 
solution is 

u(x\, X2) = max{0, x\ — a, X2 — a, X\ + X2 — b}, 



where 



2 1 

n : - and &=-(4-V2), 



3 3 

and the value at the optimum is 

J(u) = ^- (6 + y/2) w 0.549201. 



In figure 6.1 we show the contour lines obtained. In (a) the analytic solution, 
and then the discrete solution for h = 1/16, 1/32 and 1/64 in, respectively, (b), 
(c) and (d) , with the contours of the analytic solution shown in a lighter gray. 
The contours are 10 ,0.1,0.2, ...,1.1 (the CSDP solution is always positive 
due to the way it is solved). 
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0.2 0.4 0.6 0.8 1 



(b) 




0.2 0.4 0.6 0.8 1 




0.2 0.4 0.6 0. 



(c) 



(d) 



Figure 6.1: Contour lines of the solutions of the economics problem in 2D: 
analytic solution (a), and discrete approximations with h = 1/16 (b), h = 1/32 
(c), h = 1/64 (d). 
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n 


h = 1/n 


Jh{Uh) 


Jh(IhU) 


Error (L°°) 


Time 


J(U)-J h (U h ) 

h 


8 


0.125 


0.5319 


0.5444 


0.0769 


0.190s 


0.14 


16 


0.0625 


0.5404 


0.5478 


0.0300 


0.990s 


0.14 


32 


0.03125 


0.5449 


0.5488 


0.0336 


17.000s 


0.14 


G4 


0.015625 


0.5470 


0.5491 


0.0174 


751.100s 


0.14 


DO 





0.5492 











Table 6.1: Comparison of the approximations to the economics problems in 
2D for different mesh sizes. The column labeled n denotes the number of sub- 
divisions of the interval [0, 1] in each direction. The number of unknowns is 

(n+1) 2 . 

We observe that the scheme introduces quite a bit of diffusion at lower resolu- 
tions, and, in particular, the heights at the point (1, 1) increase to the analytic 
solution at that point as h decreases. However, the jump in the gradients is 
well captured, especially at higher resolutions, even though we are asking for 
conditions on the discrete Hessians which are unbounded from above. 



In table 6.1 we give some comparative results at different mesh sizes. In 
this table we indicate by Jh(Ihu) the discrete functional evaluated at Ih(u), 
the interpolant in M.^ of the exact solution, and the error column refers to 
max \uh — Ih u \ {L°° error). 

It is interesting to notice that even though the solution is not smooth (u 
is only Lipschitz) the error in the L°° norm is smaller or approximately equal 
to h. In the last column of the table we show the quantity (J(u) — Jh(uh))/h 
which is approximately 0.14 for all the values of h reported, showing that even 
under such low regularity assumptions on u, the error in the functional behaves 
as 0(h). 

In three dimensions there is also a solution with partial derivatives which 
are either or 1 , of the form 

u[x, y, z) = max {0, x — a,y — a, z — a, 

x + y — b, x + z — 6, y + z — b,x + y + z — c}. 

This time the coefficients cannot be expressed in a simple form, since they 
involve roots of polynomials of high degree, and we just give numerical approx- 
imations: 

a = 0.840627, b = 1.038352, c = 1.236077, 

so that b — a = c — b, and the value of the functional is 

J(u) w 0.868405. 



In figure 6.2 we show the regions where Vu- (1, 1, 1) is 0, 1, 2 or 3 in, different 
shades of gray. We illustrate the regions with wire frames viewed from the 
positive octant in (a), and exploded views of the solid regions from the positive 
octant in (b) and the negative octant in (c). 

In table |6.2| we give some comparative results at different mesh sizes. It is 
interesting to notice here that the L°° error is not converging to zero with order 
0(h). Nevertheless, the quantity ( J(u) — Jh{uh))/h (shown in the last column) 
decreases slowly as h goes to 0, meaning that the error J(u) — Jh(uh) in the 
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(a) 



Figure 6.2: Regions where the analytic solution in 3D has constant gradient: 
with wire frames viewed from the positive octant (a), and exploded views of the 
solid regions from the positive octant (b), and the negative octant (c). 



n 


h = l/n 


Jh{uh) 


Jh(hu) 


Error (L°°) 


Time 


J(u)-Jh(uh) 
h 


4 


0.25 


0.8195 


0.8449 


0.1356 


0.21s 


0.20 


8 


0.125 


0.8484 


0.8605 


0.1281 


3.87s 


0.16 


12 


0.0833 


0.8578 


0.8647 


0.1130 


10.29s 


0.13 


1G 


0.0625 


0.8622 


0.8661 


0.1135 


7m 20s 


0.10 


20 


0.0500 


0.8648 


0.8671 


0.1177 


50m 23s 


0.07 


DO 





0.8684 











Table 6.2: Comparison of the approximations to the economics problems in 3D 
for different mesh sizes. As before, the column labeled n denotes the number of 
subdivisions of the interval [0, 1] in each direction. The number of unknowns is 
(n+1) 3 . 



functional behaves as O(h) in this range, exhibiting a similar behavior to that 
of the two dimensional case. 

Remark 6.2. Theoretically, semidefinite programs are polynomial time solvable. 
However, as the tables [6T| and [6~2| show, in practice we cannot go too far with 
the number of unknowns. 

A reasonable size for the two dimensional problems we considered is a mesh 
of about n — 40 subdivisions. 



6.2 Projections 

Carlier, Lachand-Robert and Maury [5] gave several examples of H 1 and Hq 
projections, and in this section we consider one of the functions they considered, 
namely, 

f( Xl ,x 2 ) = -(4 + hx lX 2 2 ) e- 30 ((^-i/2) 2 +fe-i/2) 2 ) for (xi) Xi) e q. 



We show the graph of the original function in figu re |6.3[ and that of the 
resulting L 1 , L 2 , L°°, H 1 and Hq projections in figure 6.4 As in the original 



article, these graphs are shown upside down. The interested reader may observe 
that our results are qualitatively different from those in [6j p. 304]. 
Using a similar function, but with more symmetries, 

g(xi,X2) — — sin (27nKi) sin (11x2) for (#1,3:2) € Q, 
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Figure 6.3: Graph of f(x 1 ,x 2 ) = -(4 + bxixf) e -30[(x 1 -i/2) 2 +( 2;2 -i/2) 2 ] ] shown 
upside down. 



we show in figure [675] that the L°° projection need not be unique. The solution 
in (b) is the one obtained by the semidefinite programming code. 

In all these examples, we used a mesh with 41 x 41 nodes. The times ranged 
from about 40 seconds for the L°° projections to about 200 seconds for the L 2 
projection. The H 1 and Hq projections took about the same time, near 140 
seconds. 



6.3 Fitting data 

As mentioned in the introduction, many problems in science are modelled via 
convex functions, raising the question of how measured data (usually non con- 
vex) can be approximated by a convex function. 

Often the fitness to data is done parametrically. For example, by assuming 
that the underlying function is a linear combination of some given polynomials, 
and minimizing over all possible parameters in a convenient norm. Even in this 
case, approximating the data by a linear combination which is also convex — but 
otherwise arbitrary — may be challenging. 

In this section we show how our numerical scheme could be used for fitting 
data on a regular mesh by discrete functions having a positive semidefinite dis- 
crete Hessian. Though the resulting discrete functions may not be extended to a 
convex function of continuous variables, as shown in example |2. 3 1 the underlying 
"true" convex function might be well captured by the discrete function. 

In the tests we show, we perturbed with random noise the values of the 
function 

f(x!,x 2 ) = {xx - 1/2) 2 + 2 (x 2 - 1/2) 2 , 



on a regular mesh of 41 x 41 nodes. The original function and the perturbed 
data are represented in figure 6.6 (a) and (b), respectivelyF] 

We chose to use a uniformly distributed noise between — e and e for each 
node, where e = 10 h (h being the mesh size). In this way we simulate some 
intrinsic measurement noise whose distribution is presumably known, and that 
the mesh has been chosen finer than the measurement noise. Needless to say, 
our choice of noise is quite arbitrary, and there are many other possibilities to 
choose from. 



1 \t must be kept in mind that, though the data is discrete, the graphic software makes its 
own interpolations for drawing surfaces and level curves. 
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L 1 projection 




L 2 projection 



L°° projection 



H 1 projection 






Hq projection 





Figure 6.4: Projections and its comparisons to the function of figure |673| shown 
upside down. 
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(a) 




(b) 




(c) 



Figure 6.5: Non-uniqueness of the L°° projection, (a) original function, (b) and 
(c) two optimal solutions. 



In figure 
L 2 and L 



6.6 



(c), (d) and (e) we show the resulting level curves for the L 1 , 
discrete projections (respectively), with the contours of the original 
unperturbed function in a lighter gray. The L 2 discrete projection may be 
considered as a variant of a least squares approximation, but the others are not 
seen as often. 



The times spent by the semidefinite code were similar to those of section 6.2 



about 170, 200 and 40 seconds (respectively). As is to be suspected, for the same 
mesh but smaller noise (e.g., taking e = h), the results are better and the times 
smaller. 

Comparing the level curves with those of the unperturbed function (fig- 
ure [6^6] (c), (d) and (e)), it is apparent that the L°° discrete projection seems 
to give the closest fit. Notice that this projection is also faster than the L 1 or 
L 2 discrete projections by a factor of 4 to 5. 

For the run shown, the maximum absolute value of the differences between 
the perturbed data and the original function on the mesh is about 0.25 (10 
times the mesh size, as designed), whereas the maximum absolute value of the 
differences between the L°° projection and the original function is 0.42, attained 
at (0,1). 

However, as can be seen in figure 6.6 (e), the L°° projection gives a good 
fit to the unperturbed function in the interior nodes, and is actually between 
±0.018 (somewhat smaller than the mesh size) for nodes of the square 0.1 < 
X\,xi < 0.9, but deteriorates near the boundary (as do the other projections). 
This is perhaps better appreciated in figure 6.6 (f), where the graph of the L°° 
discrete projection is compared to that of the unperturbed original function. 

The "overshooting" at the boundary is a typical and known phenomenon 
when fitting data by convex functions. See, for instance, the article by Meyer [T5] , 
where the one dimensional case is discussed. 



Acknowledgements 

• Our gratitude to A. Manelli for bringing the monopolist problem |6.1| to 
our attention, and for his many enlightening comments on the subject. 

• Our thanks to the anonymous referees for their comments which made the 



presentation more clear, and for suggesting the applications in section 6.3 



29 




0.0 0.2 0.4 



(e) 




Figure 6.6: Fitting a convex surface to perturbed data. Original function (a), 
perturbed data (b), level curves of the L 1 projection (c), level curves of the L 2 
projection (d), level curves of the L°° projection (e), 3D graphs of the unper- 
turbed function and L°° projection (f). 
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